A large number of top quarks will be produced at the Large Hadron Collider (LHC) during the
Run II period. This will allow us to measure rare processes in the top sector in great detail.

We present a study of top-quark pair production in association with a bottom-quark pair (ttbb) from
fast simulations for the Compact Muon Solenoid (CMS) experiment. The differential distributions
of ttbb are compared with those of top-quark pair production with two additional jets (ttjj) and of
production in association with the Higgs boson (ttH), where the Higgs boson decays to a bottom-quark
pair. The significances of the ttbb process in the dileptonic and the semileptonic decay modes are
calculated for data corresponding to an integrated luminosity of 10 fb$^{-1}$, which is foreseen to be
collected in the early Run II period. This study will provide an important input to searches for new
physics beyond the standard model, as well as to searches for the ttH process, in which the Yukawa
coupling of the top quark can be directly measured.

I. INTRODUCTION

After the discovery of the Higgs boson in 2012, the focus
has shifted from searching for the long-sought Higgs boson
to measuring the properties of this new boson, including
its Yukawa coupling to the top quark. Because the top
quark has the largest mass in the standard model, it plays
an important role in checking the consistency of the
Higgs boson with the standard model predictions. One
of the most promising channels for a direct measurement
of the top quark's Yukawa coupling is the production of a
top-quark pair in association with the Higgs boson. A
Higgs boson decaying to bb, as in the standard model,
leads to a ttbb final state. This final state, which has not
yet been observed using Run I (2010-2012) data, has an
irreducible nonresonant background from the production
of a top-quark pair in association with a b-quark pair or
with two jets that fake b jets.

The expected cross section for ttH production in pp
collisions at $\sqrt{s} = 8$ TeV at next-to-leading order (NLO)
is 0.128 pb, with an asymmetric scale uncertainty and a
PDF+$\alpha_s$ uncertainty of $\pm$ 0.010 pb [1]. Calculations of
$\sigma_{ttjj}$ and $\sigma_{ttbb}$ have been performed to NLO
precision. The predictions at $\sqrt{s} = 8$ TeV are
$\sigma_{ttjj} = 21.0 \pm 2.9$ (scale) pb and
$\sigma_{ttbb} = 0.23 \pm 0.05$ (scale) pb [2]. The ratio of
$\sigma_{ttbb}$ to $\sigma_{ttjj}$ is 0.011 $\pm$ 0.003. In this
calculation, the additional jets must have transverse momenta
$p_T > 40$ GeV/c and absolute pseudorapidity $|\eta| < 2.5$.
The dominant uncertainties in these calculations come from
the factorization and the renormalization scales, owing to
the presence of two very different scales in this process:
the top-quark mass and the jet $p_T$.
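
The quoted ratio can be cross-checked from the two predicted cross sections. The following is a minimal sketch that assumes the two scale uncertainties are uncorrelated and combines the relative uncertainties in quadrature; the treatment used in the reference calculation may differ, and the function name is illustrative only.

```python
import math


def ratio_with_uncertainty(num, num_err, den, den_err):
    """Return (ratio, uncertainty), assuming uncorrelated uncertainties."""
    ratio = num / den
    # Relative uncertainties added in quadrature (assumption of this sketch).
    relative = math.hypot(num_err / num, den_err / den)
    return ratio, ratio * relative


# NLO predictions at sqrt(s) = 8 TeV quoted in the text, in pb.
sigma_ttbb, err_ttbb = 0.23, 0.05
sigma_ttjj, err_ttjj = 21.0, 2.9

ratio, ratio_err = ratio_with_uncertainty(sigma_ttbb, err_ttbb,
                                          sigma_ttjj, err_ttjj)
print(f"sigma_ttbb / sigma_ttjj = {ratio:.3f} +/- {ratio_err:.3f}")
```

Under this assumption the sketch gives 0.011 with an uncertainty of about 0.003, in line with the numbers quoted above.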

The first measurements of the cross sections $\sigma_{ttbb}$
and $\sigma_{ttjj}$ and of their ratio were presented by the
Compact Muon Solenoid (CMS) experiment at the CERN
Large Hadron Collider (LHC), using the full data
sample of pp collisions at a center-of-mass energy of
8 TeV, which corresponds to an integrated luminosity
of 20 fb$^{-1}$. The cross sections measured in the same
phase space as in the theory calculation are
$\sigma_{ttbb} = 0.36 \pm 0.08$ (stat.) $\pm$ 0.10 (syst.) pb
and $\sigma_{ttjj} = 16.1 \pm 0.7$ (stat.) $\pm$ 2.1 (syst.) pb,
respectively [3]. The measured cross section ratio of
$\sigma_{ttbb}$ to $\sigma_{ttjj}$ is 0.022 $\pm$ 0.004 (stat.)
$\pm$ 0.005 (syst.), which is compatible within 1.6 standard
deviations with the theory prediction of 0.011 $\pm$ 0.003.
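
The compatibility quoted above can be illustrated with a simple pull calculation. This sketch assumes that the statistical, systematic and theory uncertainties are independent and can be added in quadrature; the original analysis may have used a different procedure, and the function name is hypothetical.

```python
import math


def pull(measured, measured_errs, predicted, predicted_err):
    """Difference between measurement and prediction in units of the
    total uncertainty, with all components added in quadrature."""
    total = math.sqrt(sum(e ** 2 for e in measured_errs) + predicted_err ** 2)
    return (measured - predicted) / total


# Measured ratio (stat., syst.) and NLO prediction quoted in the text.
n_sigma = pull(0.022, [0.004, 0.005], 0.011, 0.003)
print(f"difference = {n_sigma:.1f} standard deviations")
```

With these inputs the difference comes out at about 1.6 standard deviations.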

A large number of top-quark candidates is expected to
be produced at the LHC even during the early Run II
period (2015-2017), which will allow us to measure these
processes in great detail. Performing the same
measurement at 13 TeV is therefore important: the
experimental measurements of $\sigma_{ttjj}$ and $\sigma_{ttbb}$
can provide a good test of NLO QCD theory and an
important input on the main background in the search for
the ttH process. This study will also provide useful
information about the main background in searches for new
physics beyond the standard model, such as the
flavor-changing neutral current process in which one of the
top quarks decays to an up (or charm) quark and a Higgs
boson, and the Higgs boson decays to a bottom-quark
pair.

In this paper, we present a fast-simulation study of the
ttbb process and compare it with the ttjj and the ttH
processes. The differential distributions of additional b jets
from ttbb events are compared with the differential
distributions of b jets from ttjj events and from ttH events
where the Higgs boson decays to a bottom-quark pair.


II. SAMPLES 

The simulated pp collision data samples for the three
processes, ttbb, ttjj and ttH, are produced separately at a
center-of-mass energy of 13 TeV. The ttbb and ttH
samples are generated at NLO with the MadGraph5_aMC@NLO
(v2.1.2) [4] framework and are interfaced with PYTHIA
(v8.185) for the hadronization. The ttjj samples for the
differential distributions are generated at leading order
with MadGraph5, owing to limited computing resources, and
are interfaced with PYTHIA (v6.428) for the hadronization.
For each of the ttbb and ttH samples, 100K events are
produced, while for the ttjj sample, 1M events are produced.

A transverse momentum threshold of 20 GeV/c and
a pseudorapidity requirement of $|\eta| < 2.5$ are applied at
the production level to the additional jets that are not
from top quarks. The additional jets in the ttjj events
include b quarks as well as light-flavor quarks and gluons.
The ttbb events are generated in the 4-flavor scheme, where
the b quark is treated as massive. In the ttbb events, there
must be at least two additional b jets with a $p_T$ threshold
of 20 GeV/c. In the ttH events, the decay of the Higgs
boson is handled in PYTHIA. The cross sections of the ttbb,
ttjj and ttH processes at $\sqrt{s} = 13$ TeV are calculated
to NLO within the MadGraph5_aMC@NLO framework. They are
224 $\pm$ 1.8 pb for the ttjj process, 4.7 pb for the ttbb
process, and 0.32 pb for the ttH process. These cross
sections are used to calculate the significance by
normalizing the number of events to data corresponding to
an integrated luminosity of 10 fb$^{-1}$.
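
The normalization described above amounts to scaling each simulated event by a weight of cross section times luminosity divided by the number of generated events. The sketch below uses the cross sections and sample sizes quoted in this section; the function names are illustrative, and any branching fractions or filter efficiencies that the actual normalization may include are not modeled here.

```python
PB_TO_FB = 1000.0  # 1 pb = 1000 fb


def expected_events(xsec_pb, lumi_fb_inv):
    """Number of produced events for a cross section (pb) and luminosity (fb^-1)."""
    return xsec_pb * PB_TO_FB * lumi_fb_inv


def event_weight(xsec_pb, lumi_fb_inv, n_generated):
    """Per-event weight that normalizes a simulated sample to the luminosity."""
    return expected_events(xsec_pb, lumi_fb_inv) / n_generated


LUMI = 10.0  # fb^-1, as assumed in the text
# (cross section in pb, number of generated events) from this section
samples = {
    "ttbb": (4.7, 100_000),
    "ttjj": (224.0, 1_000_000),
    "ttH": (0.32, 100_000),
}

weights = {name: event_weight(xsec, LUMI, n_gen)
           for name, (xsec, n_gen) in samples.items()}
for name, weight in weights.items():
    print(f"{name}: weight per simulated event = {weight:.3f}")
```

For example, the ttbb sample corresponds to about 47000 produced events at 10 fb$^{-1}$, so each of the 100K simulated events carries a weight of about 0.47.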

The generated events are processed through the detector
simulation of the CMS detector with the DELPHES package
(v3.1.2) [5]. As in the CMS reconstruction, the objects
from the particle-flow algorithm implemented in DELPHES
are used throughout this analysis. Pileup events are
available in the DELPHES package and can be merged into
the simulated events. In this analysis, however, we assume
that the pileup mitigation to be developed at the CMS
experiment can reduce the effect of pileup events
significantly. It is also important to understand the
physics differences in the absence of pileup. Therefore,
we focus on the physics alone, under the condition that
there is no pileup effect.

In the DELPHES fast simulation, the final momenta of
all physics objects, such as electrons, muons and jets,
are smeared as a function of $p_T$ and $\eta$ so that they
represent the detector effects in the CMS experiment.
The reconstruction efficiencies of the electrons, muons
and jets are also parameterized as functions of $p_T$ and
$\eta$ based on information from measurements using Run I
data. The muon identification efficiency is set to 95%
for muons with $10 < p_T < 100$ GeV/c. The electron
identification efficiency is set to 95% for $|\eta| < 1.5$
and 85% for $1.5 < |\eta| < 2.5$. Isolated muons and
electrons are selected by requiring a relative isolation
of $I_{rel} < 0.1$, where $I_{rel}$ is defined as the sum
of the surrounding energy from the particle-flow tracks,
photons and neutral hadrons divided by the transverse
momentum of the muon or electron. The particle-flow jets
used in this analysis are clustered from the particle-flow
tracks and particle-flow towers. If a jet is already
reconstructed as an isolated electron, muon or photon, it
is excluded from further consideration. The b-tagging
efficiency, parameterized as a function of the $p_T$ and
$\eta$ of the jet, ranges from 20% to 50%. The fake
b-tagging rate for light-flavor jets is set to 0.1%, which
corresponds to the tight working point in the CMS
measurement [6].
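
The lepton efficiencies and the isolation requirement described above can be written as simple step functions. The following is only a sketch: the DELPHES detector card uses its own formula syntax, the function names are hypothetical, and the value of zero returned outside the quoted ranges is a placeholder, since the text does not give the efficiencies there.

```python
def muon_id_efficiency(pt):
    """Muon identification efficiency quoted for 10 < pT < 100 GeV/c."""
    if 10.0 < pt < 100.0:
        return 0.95
    return 0.0  # placeholder: not specified in the text


def electron_id_efficiency(eta):
    """Electron identification efficiency as a step function of |eta|."""
    abs_eta = abs(eta)
    if abs_eta <= 1.5:
        return 0.95
    if abs_eta < 2.5:
        return 0.85
    return 0.0  # placeholder: outside the quoted range


def relative_isolation(lepton_pt, surrounding_pts):
    """Sum of the surrounding particle-flow activity divided by the lepton pT."""
    return sum(surrounding_pts) / lepton_pt


def is_isolated(lepton_pt, surrounding_pts, threshold=0.1):
    """Apply the I_rel < 0.1 requirement."""
    return relative_isolation(lepton_pt, surrounding_pts) < threshold


# A 50 GeV/c lepton with 3 GeV of nearby activity has I_rel = 0.06.
print(is_isolated(50.0, [1.0, 2.0]))
```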


III. DILEPTON ANALYSIS 

In order to restrict the phase space to the region that we
can measure experimentally in the dileptonic decay mode, we
apply the following event selections at the reconstruction
level. Events should have at least two leptons with
$p_T > 20$ GeV/c and $|\eta| < 2.4$ (S1). At least four
reconstructed jets with $p_T > 30$ GeV/c and $|\eta| < 2.5$
are required (S2). At least two b-tagged jets are required
to select the tt events (S3). After this selection step,
based on the experimental result from the CMS experiment,
we would still have remaining backgrounds from single-top
events and Drell-Yan events. In order to remove these
possible remaining backgrounds and to be sure that we have
only ttjj, ttbb and ttH events after the final selection,
we further require the event to have one more b-tagged jet,
adding up to at least three b-tagged jets (S4).
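
The selection steps S1 to S4 can be sketched as a sequential cut flow. The `Lepton` and `Jet` classes below are hypothetical containers introduced for illustration, and the sketch assumes that b-tagged jets are counted only among the jets passing the S2 kinematic requirements.

```python
from dataclasses import dataclass


@dataclass
class Lepton:
    pt: float   # GeV/c
    eta: float


@dataclass
class Jet:
    pt: float   # GeV/c
    eta: float
    btag: bool = False


def dilepton_cutflow(leptons, jets):
    """Return the names of the consecutive selection steps an event passes."""
    good_leptons = [l for l in leptons if l.pt > 20.0 and abs(l.eta) < 2.4]
    good_jets = [j for j in jets if j.pt > 30.0 and abs(j.eta) < 2.5]
    n_btag = sum(1 for j in good_jets if j.btag)

    steps = [
        ("S1", len(good_leptons) >= 2),  # at least two leptons
        ("S2", len(good_jets) >= 4),     # at least four jets
        ("S3", n_btag >= 2),             # at least two b-tagged jets
        ("S4", n_btag >= 3),             # at least three b-tagged jets
    ]
    passed = []
    for name, ok in steps:
        if not ok:
            break  # later steps are only evaluated if earlier ones pass
        passed.append(name)
    return passed


leptons = [Lepton(45.0, 0.3), Lepton(28.0, -1.1)]
jets = [Jet(80.0, 0.1, True), Jet(60.0, -0.7, True),
        Jet(45.0, 1.9, True), Jet(35.0, 2.2, False)]
print(dilepton_cutflow(leptons, jets))
```

Counting how many events reach each step, weighted by the sample normalization, gives the kind of event yields reported per selection step.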

In the best scenario, where we can identify the origin
of the b jets, we can see the potential differences in the
kinematic distributions of the additional jets not from
top quarks for the ttbb, ttjj and ttH processes. For
this purpose, we rely on the Monte Carlo information to
identify whether or not the jets come from a top quark.
The additional jets not from top quarks are identified by
using the geometric information
$\Delta R(j, q) = \sqrt{\Delta\eta^2 + \Delta\phi^2}$,
requiring $\Delta R < 0.5$, where j denotes a jet at the
reconstruction level and q denotes a quark that does not
originate from a top quark. If the jet matches a b quark,
it is treated as a b jet. Otherwise, the jet is treated as
a light-flavor jet.
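
The geometric matching can be illustrated as follows. This is a sketch under stated assumptions: jets and quarks are represented as plain dictionaries with hypothetical keys `eta`, `phi` and `pdg_id`, the closest quark within the cone is taken as the match, and a jet with no matched quark is reported as `None`, a case the text does not spell out.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi


def delta_r(eta1, phi1, eta2, phi2):
    """Delta R = sqrt(delta_eta^2 + delta_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def match_jet(jet, quarks, max_dr=0.5):
    """Return the closest quark with Delta R < max_dr, or None."""
    best, best_dr = None, max_dr
    for quark in quarks:
        dr = delta_r(jet["eta"], jet["phi"], quark["eta"], quark["phi"])
        if dr < best_dr:
            best, best_dr = quark, dr
    return best


def jet_flavour(jet, additional_quarks):
    """Label a jet as 'b' or 'light' from its matched additional quark."""
    quark = match_jet(jet, additional_quarks)
    if quark is None:
        return None  # unmatched jet (assumption of this sketch)
    # PDG particle code 5 is the b quark; -5 is the anti-b quark.
    return "b" if abs(quark["pdg_id"]) == 5 else "light"


jet = {"eta": 0.1, "phi": 0.2}
quarks = [{"eta": 0.0, "phi": 0.0, "pdg_id": -5},
          {"eta": 1.5, "phi": 2.0, "pdg_id": 21}]
print(jet_flavour(jet, quarks))
```

Wrapping the azimuthal difference matters for jets close to $\phi = \pm\pi$, where a naive subtraction would overestimate $\Delta R$.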

Plots for this analysis are shown at the preselection
level because the simulated samples have limited statistics
and the selection is tight; we assume that the
distributions would not be significantly different after
the event selections. Figure 1 shows the transverse
momentum and pseudorapidity distributions for the two jets
that are not from top quarks. The graphs in Fig. 1 show
that the additional b jets in the ttbb events tend to have
softer $p_T$ compared to those from the ttH and the ttjj
events. The pseudorapidity distributions from the ttbb
events are shown to be more central than those from the
other processes. The invariant mass and $\Delta R$
distributions for these two jets are also presented in
Fig. 2. As expected, the invariant mass of the two b jets
from the Higgs boson in the ttH process has a clear peak
around 125 GeV/c$^2$, the simulated mass in the ttH sample,
while that is not the case for the other two processes,
ttbb and ttjj. The $\Delta R$ distribution shows a clear
distinguishing feature. The additional jets from the ttbb
process have a narrow angle between them, while those from
the ttjj events have a wider angle and those from the ttH
events are in between.

TABLE I: Expected number of events in the dileptonic decay
mode normalized to data corresponding to an integrated
luminosity of 10 fb$^{-1}$ with the NLO cross sections. The
renormalization and the factorization scale uncertainties
are shown only at the final selection. The acceptance
($\epsilon$) after the final selection is also presented.

Process   two leptons   jets >= 4   b-tag >= 2   b-tag >= 3   $\epsilon$ (%)
ttbb      520           308         118          33 ± 0.8     1.5
ttjj      30183         10285       1712         97 ± 16      0.09
ttH       39            21          7.8          2.0 ± 0.1    1.3
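
The invariant mass of the two additional jets shown in Fig. 2 is built from the jet four-momenta. A minimal sketch, assuming each jet is described by ($p_T$, $\eta$, $\phi$, mass) in natural units and using illustrative function names:

```python
import math


def four_vector(pt, eta, phi, mass=0.0):
    """Return (E, px, py, pz) from pT, pseudorapidity, azimuth and mass."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def invariant_mass(jet1, jet2):
    """Invariant mass of the two-jet system from the summed four-vectors."""
    energy, px, py, pz = (a + b for a, b in zip(jet1, jet2))
    m2 = energy * energy - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against rounding below zero


# Two massless, back-to-back central jets with pT = 62.5 GeV/c each.
jet_a = four_vector(62.5, 0.0, 0.0)
jet_b = four_vector(62.5, 0.0, math.pi)
print(f"m_jj = {invariant_mass(jet_a, jet_b):.1f}")
```

The example pair is chosen so that the dijet mass is 125 in the same units, the value at which the ttH distribution peaks.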

Table I shows the expected number of events at each
event selection step, normalized to data corresponding
to an integrated luminosity of 10 fb$^{-1}$ by using the NLO
cross sections for each process. The acceptance after the
final selection is also presented. The expected number
of events from the ttbb process after the full selection
is 33. The expected contributions from the ttH process
and the ttjj process after the full selection are 2.0 events
and 97 events, respectively. The significance $s/\sqrt{s+b}$,
where s is the number of signal events (ttbb) and b is the
sum of the ttjj and ttH background events, is 2.9 without
taking into account the systematic uncertainty.
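
The quoted significance follows directly from the final-selection yields. A minimal sketch with an illustrative function name:

```python
import math


def significance(s, b):
    """Statistical-only significance s / sqrt(s + b)."""
    return s / math.sqrt(s + b)


# Final-selection yields in the dileptonic mode at 10 fb^-1.
s_dilepton = 33.0          # ttbb (signal)
b_dilepton = 97.0 + 2.0    # ttjj + ttH (background)

z_dilepton = significance(s_dilepton, b_dilepton)
print(f"significance = {z_dilepton:.1f}")
```

With 33 signal events and 99 background events this gives about 2.9.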


FIG. 1: Jet $p_T$ (top) and $\eta$ (bottom) distributions for
the first and the second additional jets not from top quarks
in the dileptonic decay mode. The red solid line indicates
the ttbb process, the purple dotted line the ttH process,
and the blue dashed line the ttjj process.

FIG. 2: Invariant mass (left) and $\Delta R$ (right)
distributions for the two additional jets not from top
quarks for the ttbb, ttjj and ttH processes in the
dileptonic decay mode. The red solid line indicates the
ttbb process, the purple dotted line the ttH process, and
the blue dashed line the ttjj process.


IV. LEPTON + JETS ANALYSIS

We also performed a study of the semileptonic decay
mode. As in the dileptonic case, the phase space is
restricted to the region that we can measure
experimentally. The following event selections are applied
at the reconstruction level. Events should have exactly
one lepton with $p_T > 40$ GeV/c and $|\eta| < 2.4$ (S1).
At least six reconstructed jets with $p_T > 30$ GeV/c and
$|\eta| < 2.5$ are required (S2). At least two b-tagged
jets are required to select the tt events (S3). In order
to be sure that we have ttjj, ttbb and ttH events after
the final event selection, we further require each event to
have one more b-tagged jet, adding up to at least three
b-tagged jets (S4).

The same strategy is applied in the semileptonic decay
mode. We identified two jets not from top quarks by
using the Monte Carlo information. The additional jets
not from top quarks are identified by using the geometric
information $\Delta R(j, q) < 0.5$, in the same fashion as
was done in the dileptonic decay mode.

Plots in this analysis are shown at the preselection
level because the simulated samples have limited statistics
and the selection is tight; we assume that the
distributions would not be significantly different after the
event selections. Figure 3 shows the transverse momentum
and pseudorapidity distributions for the two jets that
are not from top quarks. It shows that the additional
b jets in the ttbb and ttH events tend to have softer $p_T$
and more central spectra compared to those from ttjj events.
The invariant mass and the $\Delta R$ distributions for these
two jets are also presented in Fig. 4. As expected, the
invariant mass of the two b jets from the Higgs boson in
the ttH process has a clear peak around 125 GeV/c$^2$, the
simulated mass in the ttH sample, while that is not the
case for the other two processes, ttbb and ttjj. The
$\Delta R$ distribution shows a clear distinguishing feature.
The additional jets from ttbb have a narrow angle between
them, while those from the ttjj events have a wider angle
and those from the ttH events are in between.

TABLE II: Expected number of events in the semileptonic
decay mode normalized to data corresponding to an
integrated luminosity of 10 fb$^{-1}$ with the NLO cross
sections. The renormalization and the factorization scale
uncertainties are shown only at the final selection. The
acceptance ($\epsilon$) after the final selection is also
presented.

Process   one lepton   jets >= 6   b-tag >= 2   b-tag >= 3    $\epsilon$ (%)
ttbb      5357         2138        970          285 ± 27      0.6
ttjj      273785       47297       12238        2149 ± 549    0.04
ttH       508          163         63           21 ± 2        0.5

The distributions have features similar to those of the
distributions in the dileptonic decay mode. However, the
pseudorapidity distributions of the two additional jets
from ttjj events are more central than those in the
dileptonic decay mode.

Table II shows the expected number of events at each
event selection step, normalized to data corresponding
to an integrated luminosity of 10 fb$^{-1}$ by using the NLO
cross sections for each process. The acceptance after the
final selection is also presented. The expected number
of events from the ttbb process after the full selection is
285. The expected contributions from the ttH
process and the ttjj process after the full selection are
21 events and 2149 events, respectively. The significance
$s/\sqrt{s+b}$, where s is the number of signal events (ttbb)
and b is the sum of the ttjj and ttH background events, is 5.8
without taking into account the systematic uncertainty.

FIG. 3: Jet $p_T$ (top) and $\eta$ (bottom) distributions for
the first and the second additional jets not from top
quarks in the semileptonic decay mode. The red solid line
indicates the ttbb process, the purple dotted line the ttH
process, and the blue dashed line the ttjj process.

FIG. 4: Invariant mass (left) and $\Delta R$ (right)
distributions for the two additional jets not from top
quarks for the ttbb, ttjj and ttH processes in the
semileptonic decay mode. The red solid line indicates the
ttbb process, the purple dotted line the ttH process, and
the blue dashed line the ttjj process.


V. DISCUSSION

Differential kinematic distributions of the additional
jets from the ttbb process are compared with those of the
additional jets from the ttjj process and of the two b jets
from the Higgs boson in ttH events, where H decays to
bb, at $\sqrt{s} = 13$ TeV. In the best scenario, where the
b jet assignment is correct, the study clearly shows that
the invariant mass and the $\Delta R$ of the two additional
jets are promising variables for separating these three
processes. In reality, identifying the origin of the b
jets at the reconstruction level in real data events
remains challenging. Because no single variable would be
able to separate the ttH process from the ttbb or ttjj
process in a real data analysis, a multivariate analysis
would be required to increase the significance. The
variables presented here can be used as input variables in
such a multivariate analysis.

We also discuss the significance of ttbb with data
corresponding to an integrated luminosity of 10 fb$^{-1}$,
which is foreseen to be collected during the early Run
II period. The significances are 2.9 and 5.8 for the
dileptonic decay mode and the semileptonic decay mode,
respectively. However, if we take into account the
systematic uncertainty of the tt cross section from Run I
measurements, 5% for the dileptonic events [7] and 7% for
the semileptonic events, the significance goes down.
From the following formula for the significance taking
into account the systematic uncertainty,

$$\frac{s}{\sqrt{s + b + (b \times \mathrm{unc.})^2}}, \qquad (1)$$

where unc. indicates the systematic uncertainty of the
background, the significances go down to 1.9 for the
dileptonic decay mode and 1.8 for the semileptonic mode.
The significances of the two channels become compatible
with each other. Maintaining a high significance at
$\sqrt{s} = 13$ TeV with data corresponding to an integrated
luminosity of 10 fb$^{-1}$ therefore requires a more complex
approach and precise measurements of tt pair production.
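
Eq. (1) can be evaluated directly from the final-selection yields. The sketch below, with an illustrative function name, applies it to the semileptonic yields of Table II with the 7% background uncertainty quoted above; setting the uncertainty to zero recovers the statistical-only expression $s/\sqrt{s+b}$.

```python
import math


def significance_with_syst(s, b, unc):
    """Eq. (1): s / sqrt(s + b + (b * unc)^2), with unc a relative uncertainty."""
    return s / math.sqrt(s + b + (b * unc) ** 2)


# Semileptonic final-selection yields at 10 fb^-1 (Table II).
s_semilep = 285.0
b_semilep = 2149.0 + 21.0  # ttjj + ttH

z_stat = significance_with_syst(s_semilep, b_semilep, 0.0)
z_syst = significance_with_syst(s_semilep, b_semilep, 0.07)
print(f"statistical only: {z_stat:.1f}, with 7% systematic: {z_syst:.1f}")
```

Because the systematic term grows with $b^2$ while the statistical term grows with $b$, the background uncertainty dominates in the high-yield semileptonic channel, which is why its significance drops from about 5.8 to about 1.8.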

VI. CONCLUSION

We present a study of top-quark pair production in
association with a bottom-quark pair from fast simulations
of the CMS detector. The invariant mass and the $\Delta R$
of the two additional jets are promising variables to
separate ttbb events from ttjj and ttH events. However,
identifying the origin of the b jets at the reconstruction
level in real data events is challenging. With 10 fb$^{-1}$
of data, in the best scenario with a simple cut-and-count
method, we can reach 2.9 and 5.8 standard deviations
for the dileptonic decay mode and the semileptonic
decay mode, respectively. When the systematic uncertainty
of the tt production cross section measurement is taken
into account, the significance goes down to 1.9 for the
dileptonic mode and 1.8 for the semileptonic mode, so the
values of the significance become compatible with each
other. This would require us to measure the tt pair
production cross section precisely at $\sqrt{s} = 13$ TeV
and to use a more complex approach, such as a multivariate
technique, for the ttbb cross section measurement.